Results 1 - 20 of 242,555
1.
Nat Commun ; 15(1): 3186, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622114

ABSTRACT

Transcription termination factor ρ is a hexameric, RNA-dependent NTPase that can adopt active closed-ring and inactive open-ring conformations. The Sm-like protein Rof, a homolog of the RNA chaperone Hfq, inhibits ρ-dependent termination in vivo but recapitulation of this activity in vitro has proven difficult and the precise mode of Rof action is presently unknown. Here, our cryo-EM structures of ρ-Rof and ρ-RNA complexes show that Rof undergoes pronounced conformational changes to bind ρ at the protomer interfaces, undercutting ρ conformational dynamics associated with ring closure and occluding extended primary RNA-binding sites that are also part of interfaces between ρ and RNA polymerase. Consistently, Rof impedes ρ ring closure, ρ-RNA interactions and ρ association with transcription elongation complexes. Structure-guided mutagenesis coupled with functional assays confirms that the observed ρ-Rof interface is required for Rof-mediated inhibition of cell growth and ρ-termination in vitro. Bioinformatic analyses reveal that Rof is restricted to Pseudomonadota and that the ρ-Rof interface is conserved. Genomic contexts of rof differ between Enterobacteriaceae and Vibrionaceae, suggesting distinct modes of Rof regulation. We hypothesize that Rof and other cellular anti-terminators silence ρ under diverse, but yet to be identified, stress conditions when unrestrained transcription termination by ρ may be detrimental.


Subjects
Rho Factor , Transcription Factors , Transcription Factors/genetics , Transcription Factors/metabolism , Rho Factor/chemistry , Transcription, Genetic , RNA/genetics , Binding Sites , Gene Expression Regulation, Bacterial , RNA, Bacterial/genetics
2.
Int J Mol Sci ; 25(7), 2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38612736

ABSTRACT

The discovery of new genes with novel functions is a major driver of adaptive evolutionary innovation in plants. Especially in woody plants, due to genome expansion, new genes evolve to regulate the processes of growth and development. In this study, we characterized the unique VeA transcription factor family in Populus alba × Populus glandulosa, which is associated with secondary metabolism. Twenty VeA genes were systematically characterized with respect to phylogeny, genomic distribution, gene structure and conserved motifs, promoter binding sites, and expression profiles. Furthermore, ChIP-qPCR, Y1H, and effector-reporter assays demonstrated that PagMYB128 directly regulates PagVeA3 to influence the biosynthesis of secondary metabolites. These results provide a basis for further elucidating the function of VeA genes in poplar and their genetic regulation mechanism.


Subjects
Populus , Transcription Factors , Transcription Factors/genetics , Populus/genetics , Genomics , Binding Sites , Biological Assay
3.
Int J Mol Sci ; 25(7), 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38612757

ABSTRACT

Wildtype Escherichia coli cells cannot grow on L-1,2-propanediol, as the fucAO operon within the fucose (fuc) regulon is thought to be silent in the absence of L-fucose. Little information is available concerning the transcriptional regulation of this operon. Here, we first confirm that fucAO operon expression is highly inducible by fucose and is primarily attributable to the upstream operon promoter, while the fucO promoter within the 3'-end of fucA is weak and uninducible. Using 5'RACE, we identify the actual transcriptional start site (TSS) of the main fucAO operon promoter, refuting the originally proposed TSS. Several lines of evidence are provided showing that the fucAO locus is within a transcriptionally repressed region on the chromosome. Operon activation is dependent on FucR and Crp but not SrsR. Two Crp-cAMP binding sites previously found in the regulatory region are validated, where the upstream site plays a more critical role than the downstream site in operon activation. Furthermore, two FucR binding sites are identified, where the downstream site near the first Crp site is more important than the upstream site. Operon transcription relies on Crp-cAMP to a greater degree than on FucR. Our data strongly suggest that FucR mainly functions to facilitate the binding of Crp to its upstream site, which in turn activates the fucAO promoter by efficiently recruiting RNA polymerase.


Subjects
Escherichia coli , Fucose , Binding Sites , Escherichia coli/genetics , Operon/genetics , Phosphorylation
4.
Nat Commun ; 15(1): 3146, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605029

ABSTRACT

Despite their lack of a defined 3D structure, intrinsically disordered regions (IDRs) of proteins play important biological roles. Many IDRs contain short linear motifs (SLiMs) that mediate protein-protein interactions (PPIs), which can be regulated by post-translational modifications like phosphorylation. 20% of pathogenic missense mutations are found in IDRs, and understanding how such mutations affect PPIs is essential for unraveling disease mechanisms. Here, we employ peptide-based interaction proteomics to investigate 36 disease-associated mutations affecting phosphorylation sites. Our results unveil significant differences in interactomes between phosphorylated and non-phosphorylated peptides, often due to disrupted phosphorylation-dependent SLiMs. We focused on a mutation of a serine phosphorylation site in the transcription factor GATAD1, which causes dilated cardiomyopathy. We find that this phosphorylation site mediates interaction with 14-3-3 family proteins. Follow-up experiments reveal the structural basis of this interaction and suggest that 14-3-3 binding affects GATAD1 nucleocytoplasmic transport by masking a nuclear localisation signal. Our results demonstrate that pathogenic mutations of human phosphorylation sites can significantly impact protein-protein interactions, offering insights into potential molecular mechanisms underlying pathogenesis.


Subjects
Intrinsically Disordered Proteins , Peptides , Humans , Phosphorylation , Peptides/metabolism , Protein Processing, Post-Translational , Gene Expression Regulation , Mutation , Intrinsically Disordered Proteins/metabolism , Protein Binding , Binding Sites , Eye Proteins/genetics
5.
Science ; 384(6691): 106-112, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38574125

ABSTRACT

The de novo design of small molecule-binding proteins has seen exciting recent progress; however, high-affinity binding and tunable specificity typically require laborious screening and optimization after computational design. We developed a computational procedure to design a protein that recognizes a common pharmacophore in a series of poly(ADP-ribose) polymerase-1 inhibitors. One of three designed proteins bound different inhibitors with affinities ranging from <5 nM to low micromolar. X-ray crystal structures confirmed the accuracy of the designed protein-drug interactions. Molecular dynamics simulations informed the role of water in binding. Binding free energy calculations performed directly on the designed models were in excellent agreement with the experimentally measured affinities. We conclude that de novo design of high-affinity small molecule-binding proteins with tuned interaction energies is feasible entirely from computation.


Subjects
Drug Design , Poly(ADP-ribose) Polymerase Inhibitors , Proteins , Binding Sites , Drug Design/methods , Ligands , Molecular Dynamics Simulation , Poly(ADP-ribose) Polymerase Inhibitors/chemistry , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Protein Binding , Proteins/chemistry , Humans
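
The abstract of this record compares computed binding free energies with measured affinities. As a rough illustration of how the two quantities relate, the sketch below converts a dissociation constant into a standard binding free energy using the textbook relation ΔG° = RT ln(Kd). The temperature (298.15 K), the 1 M standard state, and the function name `binding_free_energy` are assumptions made for this example; the abstract does not state how the authors performed the conversion.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Uses dG = R * T * ln(Kd / 1 M). The default temperature is an assumption
    for illustration, not a value taken from the abstract.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # 5 nM is the tight end of the range quoted in the abstract; 1 uM stands
    # in for "low micromolar" and is a hypothetical value.
    for label, kd in [("5 nM", 5e-9), ("1 uM", 1e-6)]:
        print(f"Kd = {label}: dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Under these assumptions, a tighter binder (smaller Kd) yields a more negative free energy, which is the direction a computed binding energy would need to follow to agree with experiment.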
6.
Sci Rep ; 14(1): 7749, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565703

ABSTRACT

DPP4 inhibitors can control glucose homeostasis by increasing the level of the incretin hormone GLP-1 due to dipeptidase mimicking. Despite their potent effects, these compounds cause unwanted toxicity attributable to their effect on other enzymes. It therefore seems essential to find novel, DPP4-selective compounds. In this study, we introduce a potent and selective DPP4 inhibitor identified via structure-based virtual screening, molecular docking, molecular dynamics simulation, MM/PBSA calculations, DFT analysis, and ADMET profiling. Compounds screened for similarity to FDA-approved DPP4 inhibitors were docked to the DPP4 enzyme. The compound with the highest docking score, ZINC000003015356, was selected. For further evaluation, molecular docking studies were performed on the selected ligands and FDA-approved drugs against the DPP8 and DPP9 enzymes. A molecular dynamics simulation was run for 200 ns, and RMSD, RMSF, Rg, PCA, and hydrogen-bonding analyses were performed. The MD outputs showed that the ligand-protein complex was stable compared with drugs available on the market. The total binding free energy obtained for the proposed DPP4 inhibitor was more negative than that of the co-crystal ligand (N7F). ZINC000003015356 complied with Lipinski's rule of five and also had low toxicity parameters according to its properties. Finally, DFT calculations indicated that this compound is sufficiently soft.


Subjects
Dipeptidyl-Peptidase IV Inhibitors , Molecular Dynamics Simulation , Dipeptidyl-Peptidase IV Inhibitors/pharmacology , Molecular Docking Simulation , Binding Sites , Dipeptidyl Peptidase 4 , Density Functional Theory , Ligands
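
The abstract of this record mentions Lipinski's rule of five as one of its drug-likeness checks. The sketch below shows one common way to apply the rule to precomputed descriptors. The `Descriptors` class, the function names, and both example descriptor sets are hypothetical; they are not the values of ZINC000003015356, and the abstract does not say which software the authors used.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    molecular_weight: float  # in daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        d.molecular_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound with at most `max_violations` violations.

    Allowing one violation is a common convention; pass 0 for a strict check.
    """
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    # Both descriptor sets are made up for illustration.
    small = Descriptors(350.4, 2.1, 2, 5)
    large = Descriptors(720.0, 6.3, 6, 12)
    print(lipinski_violations(small), passes_rule_of_five(small))
    print(lipinski_violations(large), passes_rule_of_five(large))
```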
7.
Methods Mol Biol ; 2797: 67-90, 2024.
Article in English | MEDLINE | ID: mdl-38570453

ABSTRACT

Molecular docking is a popular computational tool in drug discovery. Leveraging structural information, docking software predicts binding poses of small molecules to cavities on the surfaces of proteins. Virtual screening for ligand discovery is a useful application of docking software. In this chapter, using the enigmatic KRAS protein as an example system, we endeavor to teach the reader about best practices for performing molecular docking with UCSF DOCK. We discuss methods for virtual screening and docking molecules on KRAS. We present the following six points to optimize our docking setup for prosecuting a virtual screen: protein structure choice, pocket selection, optimization of the scoring function, modification of sampling spheres and sampling procedures, choosing an appropriate portion of chemical space to dock, and the choice of which top scoring molecules to pick for purchase.


Subjects
Algorithms , Proto-Oncogene Proteins p21(ras) , Molecular Docking Simulation , Proto-Oncogene Proteins p21(ras)/genetics , Proto-Oncogene Proteins p21(ras)/metabolism , Software , Proteins/chemistry , Drug Discovery , Ligands , Protein Binding , Binding Sites
8.
Methods Mol Biol ; 2797: 115-124, 2024.
Article in English | MEDLINE | ID: mdl-38570456

ABSTRACT

Fragment-based screening by ligand-observed 1D NMR and binding interface mapping by protein-observed 2D NMR are popular methods used in drug discovery. These methods allow researchers to detect compound binding over a wide range of affinities and offer a simultaneous assessment of solubility, purity, and chemical formula accuracy of the target compounds and the 15N-labeled protein when examined by 1D and 2D NMR, respectively. These methods can be applied for screening fragment binding to the active (GMPPNP-bound) and inactive (GDP-bound) states of oncogenic KRAS mutants.


Subjects
Drug Discovery , Proto-Oncogene Proteins p21(ras) , Proto-Oncogene Proteins p21(ras)/genetics , Ligands , Magnetic Resonance Spectroscopy , Proteins , Protein Binding , Binding Sites
9.
Database (Oxford) ; 2024, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38557634

ABSTRACT

The rapid growth in the number of experimental and predicted protein structures and more complicated protein structures poses a significant challenge for computational biology in leveraging structural information and accurate representation of protein surface properties. Recently, AlphaFold2 released the comprehensive proteomes of various species, and protein surface property representation plays a crucial role in protein-molecule interaction predictions, including those involving proteins, nucleic acids and compounds. Here, we proposed the first extensive database, namely ProNet DB, that integrates multiple protein surface representations and RNA-binding landscape for 326 175 protein structures. This collection encompasses the 16 model organism proteomes from the AlphaFold Protein Structure Database and experimentally validated structures from the Protein Data Bank. For each protein, ProNet DB provides access to the original protein structures along with the detailed surface property representations encompassing hydrophobicity, charge distribution and hydrogen bonding potential as well as interactive features such as the interacting face and RNA-binding sites and preferences. To facilitate an intuitive interpretation of these properties and the RNA-binding landscape, ProNet DB incorporates visualization tools like Mol* and an Online 3D Viewer, allowing for the direct observation and analysis of these representations on protein surfaces. The availability of pre-computed features enables instantaneous access for users, significantly advancing computational biology research in areas such as molecular mechanism elucidation, geometry-based drug discovery and the development of novel therapeutic approaches. Database URL:  https://proj.cse.cuhk.edu.hk/aihlab/pronet/.


Subjects
Proteome , RNA , Binding Sites , Databases, Protein , RNA/chemistry , Membrane Proteins , Surface Properties
10.
Genome Biol ; 25(1): 83, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566111

ABSTRACT

BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.


Subjects
DNA , Regulatory Sequences, Nucleic Acid , Binding Sites , Sequence Alignment , Algorithms , Conserved Sequence/genetics , Evolution, Molecular
11.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38561176

ABSTRACT

MOTIVATION: Understanding the intermolecular interactions of ligand-target pairs is key to guiding the optimization of drug research on cancers, which can greatly reduce the workload of wet labs. Several improved computational methods have been introduced and exhibit promising performance for these identification tasks, but some pitfalls restrict their practical applications: (i) existing methods do not sufficiently consider how multigranular molecule representations influence interaction patterns between proteins and compounds; and (ii) existing methods seldom explicitly model the binding sites involved in an interaction to enable better prediction and interpretation, which may create unexpected obstacles for biological researchers. RESULTS: To address these issues, we present DrugMGR, a deep multigranular drug representation model capable of predicting binding affinities and regions for each ligand-target pair. We conduct consistent experiments on three benchmark datasets using existing methods and introduce a new, specific dataset to better validate the prediction of binding sites. For practical application, target-specific compound identification tasks are also carried out to validate the capability for real-world compound screening. Moreover, the visualization of some practical interaction scenarios provides interpretable insights from the results of the predictions. The proposed DrugMGR achieves excellent overall performance on these datasets, demonstrating its advantages over state-of-the-art methods. Thus, DrugMGR can be fine-tuned for the downstream task of identifying potential compounds that target proteins for clinical treatment. AVAILABILITY AND IMPLEMENTATION: https://github.com/lixiaokun2020/DrugMGR.


Subjects
Proteins , Ligands , Proteins/chemistry , Binding Sites
12.
BMC Bioinformatics ; 25(1): 158, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643066

ABSTRACT

BACKGROUND: Motif finding in Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) data is essential to reveal the intricacies of transcription factor binding sites (TFBSs) and their pivotal roles in gene regulation. Deep learning technologies, including convolutional neural networks (CNNs) and graph neural networks (GNNs), have achieved success in finding ATAC-seq motifs. However, CNN-based methods are limited by the fixed width of the convolutional kernel, which makes it difficult to find multiple transcription factor binding sites with different lengths. GNN-based methods are limited by using the edge weight information directly, which makes it difficult to aggregate the neighboring nodes' information efficiently when representing node embeddings. RESULTS: To address this challenge, we developed a novel graph attention network framework named MMGAT, which employs an attention mechanism to adjust the attention coefficients among different nodes. MMGAT then finds multiple ATAC-seq motifs based on the attention coefficients of sequence nodes and k-mer nodes as well as the coexisting probability of k-mers. Our approach achieved better performance on the human ATAC-seq datasets compared to existing tools, as evidenced by the highest scores on the precision, recall, F1_score, ACC, AUC, and PRC metrics, as well as by finding 389 higher-quality motifs. To validate the performance of MMGAT in predicting TFBSs and finding motifs on more datasets, we enlarged the number of human ATAC-seq datasets to 180 and newly integrated 80 mouse ATAC-seq datasets for multi-species experimental validation. Specifically, on the mouse ATAC-seq datasets, MMGAT also achieved the highest scores on six metrics and found 356 higher-quality motifs. To facilitate researchers in utilizing MMGAT, we have also developed a user-friendly web server named MMGAT-S that hosts the MMGAT method and ATAC-seq motif finding results. CONCLUSIONS: The advanced methodology MMGAT provides a robust tool for finding ATAC-seq motifs, and the comprehensive server MMGAT-S makes a significant contribution to genomics research. The open-source code of MMGAT can be found at https://github.com/xiaotianr/MMGAT , and MMGAT-S is freely available at https://www.mmgraphws.com/MMGAT-S/ .


Subjects
Chromatin Immunoprecipitation Sequencing , Genomics , Humans , Animals , Mice , Binding Sites , Protein Binding , Genomics/methods , Chromatin/genetics , Transcription Factors/metabolism
13.
Nat Commun ; 15(1): 3193, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38609371

ABSTRACT

RNA polymerases must transit through protein roadblocks to produce full-length transcripts. Here we report real-time measurements of Escherichia coli RNA polymerase passing through different barriers. As intuitively expected, assisting forces facilitated, and opposing forces hindered, RNA polymerase passage through lac repressor protein bound to natural binding sites. Force-dependent differences were significant at magnitudes as low as 0.2 pN and were abolished in the presence of the transcript cleavage factor GreA, which rescues backtracked RNA polymerase. In stark contrast, opposing forces promoted passage when the rate of RNA polymerase backtracking was comparable to, or faster than the rate of dissociation of the roadblock, particularly in the presence of GreA. Our experiments and simulations indicate that RNA polymerase may transit after roadblocks dissociate, or undergo cycles of backtracking, recovery, and ramming into roadblocks to pass through. We propose that such reciprocating motion also enables RNA polymerase to break protein-DNA contacts that hold RNA polymerase back during promoter escape and RNA chain elongation. This may facilitate productive transcription in vivo.


Subjects
DNA-Directed RNA Polymerases , Transcription, Genetic , DNA-Directed RNA Polymerases/genetics , Binding Sites , Escherichia coli/genetics , Lac Repressors
14.
Biochemistry ; 63(8): 1038-1050, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38577885

ABSTRACT

The ethylene-forming enzyme (EFE) is an Fe(II), 2-oxoglutarate (2OG), and l-arginine (l-Arg)-dependent oxygenase that either forms ethylene and three CO2/bicarbonate from 2OG or couples the decarboxylation of 2OG to C5 hydroxylation of l-Arg. l-Arg binds with C5 toward the metal center, causing 2OG to change from monodentate to chelate metal interaction and OD1 to OD2 switch of D191 metal coordination. We applied anaerobic UV-visible spectroscopy, X-ray crystallography, and computational approaches to three EFE systems with high-resolution structures. The ineffective l-Arg analogue l-canavanine binds to the EFE with O5 pointing away from the metal center while promoting chelate formation by 2OG but fails to switch the D191 metal coordination from OD1 to OD2. Substituting alanine for R171 that interacts with 2OG and l-Arg inactivates the protein, prevents metal chelation by 2OG, and weakens l-Arg binding. The R171A EFE had electron density at the 2OG binding site that was identified by mass spectrometry as benzoic acid. The substitution by alanine of Y306 in the EFE, a residue 12 Å away from the catalytic metal center, generates an interior cavity that leads to multiple local and distal structural changes that reduce l-Arg binding and significantly reduce the enzyme activity. Flexibility analyses revealed correlated and anticorrelated motions in each system, with important distinctions from the wild-type enzyme. In combination, the results are congruent with the currently proposed enzyme mechanism, reinforce the importance of metal coordination by OD2 of D191, and highlight the importance of the second coordination sphere and longer range interactions in promoting EFE activity.


Subjects
Canavanine , Ferrous Compounds , Lyases , Ferrous Compounds/metabolism , Binding Sites , Alanine , Ketoglutaric Acids/metabolism
15.
BMC Genomics ; 25(1): 377, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632500

ABSTRACT

BACKGROUND: Deciphering gene regulation is essential for understanding the underlying mechanisms of healthy and disease states. While the regulatory networks formed by transcription factors (TFs) and their target genes have been mostly studied in relation to cis effects such as in TF binding sites, we focused on trans effects of TFs on the expression of their transcribed genes and their potential mechanisms. RESULTS: We provide a comprehensive tissue-specific atlas, spanning 49 tissues, of TF variations affecting gene expression through computational models considering two potential mechanisms: combinatorial regulation by the expression of the TFs, and regulation by genetic variants within the TF. We demonstrate that similarity between tissues based on our discovered genes corresponds to other types of tissue similarity. The genes affected by complex TF regulation, and their modelled TFs, were highly enriched for pharmacogenomic functions, while the TFs themselves were also enriched in several cancer and metabolic pathways. Additionally, genes that appear in multiple clusters are enriched for regulation of the immune system, while tissue clusters include cluster-specific genes that are enriched for biological functions and diseases previously associated with the tissues forming the cluster. Finally, our atlas exposes multilevel regulation across multiple tissues, where TFs regulate other TFs through the two tested mechanisms. CONCLUSIONS: Our tissue-specific atlas provides hierarchical tissue-specific trans genetic regulations that can be further studied for association with human phenotypes.


Subjects
Gene Expression Regulation , Transcription Factors , Humans , Transcription Factors/metabolism , Binding Sites , Protein Binding , Gene Regulatory Networks
16.
BMC Med Genomics ; 17(Suppl 1): 92, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632583

ABSTRACT

BACKGROUND: Repressor element 1 (RE1) silencing transcription factor (REST) is a transcriptional repressor abundantly expressed in aging human brains. It is known to regulate genes associated with oxidative stress, inflammation, and neurological disorders by binding to a canonical form of sequence motif and its non-canonical variations. Although analysis of genomic sequence motifs is crucial to understand transcriptional regulation by transcription factors (TFs), a comprehensive characterization of various forms of RE1 motifs in human cell lines has not been performed. RESULTS: Here, we analyzed 23 ENCODE REST ChIP-seq datasets from diverse human cell lines and identified a non-redundant set of 68,975 loci with ChIP-seq peaks. Our systematic characterization of these binding sites revealed that the canonical form of REST binding motif was found primarily in ChIP-seq peaks shared across multiple cell lines, while non-canonical forms of motifs were identified in both cell-line-specific binding sites and those shared across cell lines. Remarkably, we observed a notable prevalence of non-canonical motifs that corresponded to half segments of the canonical motif. Furthermore, our analysis unveiled the presence of cell-line-specific REST binding patterns, as evidenced by the clustering of ChIP-seq experiments according to their respective cell lines. This observation underscores the cell-line specificity of REST binding at certain genomic loci, implying intricate cell-line-specific regulatory mechanisms. CONCLUSIONS: Overall, our study provides a comprehensive characterization of REST binding motifs in human cell lines and genome-wide RE1 motif profiles. These findings contribute to a deeper understanding of REST-mediated transcriptional regulation and highlight the importance of considering cell-line-specific effects in future investigations.


Subjects
Gene Expression Regulation , Transcription Factors , Humans , Transcription Factors/genetics , Cell Line , Genomics , Binding Sites
17.
BMC Bioinformatics ; 25(1): 156, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38641811

ABSTRACT

BACKGROUND: Accurately identifying drug-target interaction (DTI), affinity (DTA), and binding sites (DTS) is crucial for drug screening, repositioning, and design, as well as for understanding the functions of targets. Although there are a few online platforms based on deep learning for drug-target interaction, affinity, and binding site identification, there is currently no integrated online platform for all three aspects. RESULTS: Our solution, the novel integrated online platform Drug-Online, has been developed to facilitate drug screening, target identification, and understanding the functions of targets in a progressive manner of "interaction-affinity-binding sites". The Drug-Online platform consists of three parts: the first part uses the drug-target interaction identification method MGraphDTA, based on graph neural networks (GNN) and convolutional neural networks (CNN), to identify whether there is a drug-target interaction. If an interaction is identified, the second part employs the drug-target affinity identification method MMDTA, also based on GNN and CNN, to calculate the strength of the drug-target interaction, i.e., affinity. Finally, the third part identifies drug-target binding sites, i.e., pockets. The method pt-lm-gnn used in this part is also based on GNN. CONCLUSIONS: Drug-Online is a reliable online platform that integrates drug-target interaction, affinity, and binding site identification. It is freely available via the Internet at http://39.106.7.26:8000/Drug-Online/ .


Subjects
Deep Learning , Drug Interactions , Binding Sites , Drug Delivery Systems , Drug Evaluation, Preclinical
18.
Elife ; 13, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38630609

ABSTRACT

Revealing protein binding sites with other molecules, such as nucleic acids, peptides, or small ligands, sheds light on disease mechanism elucidation and novel drug design. With the explosive growth of proteins in sequence databases, how to accurately and efficiently identify these binding sites from sequences becomes essential. However, current methods mostly rely on expensive multiple sequence alignments or experimental protein structures, limiting their genome-scale applications. Besides, these methods haven't fully explored the geometry of the protein structures. Here, we propose GPSite, a multi-task network for simultaneously predicting binding residues of DNA, RNA, peptide, protein, ATP, HEM, and metal ions on proteins. GPSite was trained on informative sequence embeddings and predicted structures from protein language models, while comprehensively extracting residual and relational geometric contexts in an end-to-end manner. Experiments demonstrate that GPSite substantially surpasses state-of-the-art sequence-based and structure-based approaches on various benchmark datasets, even when the structures are not well-predicted. The low computational cost of GPSite enables rapid genome-scale binding residue annotations for over 568,000 sequences, providing opportunities to unveil unexplored associations of binding sites with molecular functions, biological processes, and genetic variants. The GPSite webserver and annotation database can be freely accessed at https://bio-web1.nscc-gz.cn/app/GPSite.


Subjects
Deep Learning , Protein Binding , Proteins/metabolism , Binding Sites , Peptides/metabolism
19.
Elife ; 12, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38567819

ABSTRACT

Based on experimentally determined average inter-origin distances of ~100 kb, DNA replication initiates from ~50,000 origins on human chromosomes in each cell cycle. The origins are believed to be specified by binding of factors like the origin recognition complex (ORC) or CTCF or other features like G-quadruplexes. We have performed an integrative analysis of 113 genome-wide human origin profiles (from five different techniques) and five ORC-binding profiles to critically evaluate whether the most reproducible origins are specified by these features. Out of ~7.5 million union origins identified by all datasets, only 0.27% (20,250 shared origins) were reproducibly obtained in at least 20 independent SNS-seq datasets and contained in initiation zones identified by each of three other techniques, suggesting extensive variability in origin usage and identification. Also, 21% of the shared origins overlap with transcriptional promoters, posing a conundrum. Although the shared origins overlap more than union origins with constitutive CTCF-binding sites, G-quadruplex sites, and activating histone marks, these overlaps are comparable or less than that of known transcription start sites, so that these features could be enriched in origins because of the overlap of origins with epigenetically open, promoter-like sequences. Only 6.4% of the 20,250 shared origins were within 1 kb from any of the ~13,000 reproducible ORC-binding sites in human cancer cells, and only 4.5% were within 1 kb of the ~11,000 union MCM2-7-binding sites in contrast to the nearly 100% overlap in the two comparisons in the yeast, Saccharomyces cerevisiae. Thus, in human cancer cell lines, replication origins appear to be specified by highly variable stochastic events dependent on the high epigenetic accessibility around promoters, without extensive overlap between the most reproducible origins and currently known ORC- or MCM-binding sites.


Subjects
Origin Recognition Complex , Saccharomyces cerevisiae Proteins , Humans , Origin Recognition Complex/genetics , Origin Recognition Complex/metabolism , Replication Origin/genetics , Binding Sites , DNA Replication/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/metabolism , Chromosomes, Human/metabolism , DNA/metabolism , Cell Cycle Proteins/metabolism
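
The abstract of this record rests on two simple calculations: the share of union origins that are reproducible (20,250 out of roughly 7.5 million) and the share of origins lying within 1 kb of a binding site. The sketch below reproduces the first figure and shows one way the second kind of overlap could be computed. The function names, the single-chromosome midpoint representation, the inclusive distance cutoff, and the toy coordinates are all assumptions for illustration; the abstract does not describe the authors' actual pipeline.

```python
from bisect import bisect_left


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


def fraction_within(origins, sites, max_dist=1000):
    """Fraction of origin midpoints within `max_dist` bp of any site midpoint.

    Both inputs are lists of integer positions on one chromosome. The
    inclusive cutoff (<=) is a choice made for this sketch.
    """
    if not origins:
        return 0.0
    sites = sorted(sites)
    hits = 0
    for pos in origins:
        i = bisect_left(sites, pos)
        # Only the nearest site on each side can be the closest one.
        neighbours = sites[max(i - 1, 0):i + 1]
        if any(abs(pos - s) <= max_dist for s in neighbours):
            hits += 1
    return hits / len(origins)


if __name__ == "__main__":
    # Counts quoted in the abstract (the union total is approximate).
    print(f"{percent(20_250, 7_500_000):.2f}% of union origins are shared")
    # Toy coordinates, invented for illustration.
    print(fraction_within([500, 5000, 20000], [1200, 19500]))
```

A real analysis would operate per chromosome on genomic intervals instead of midpoints, which a dedicated interval library handles more robustly than this sketch.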
20.
Sci Rep ; 14(1): 9058, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643174

ABSTRACT

Activity cliffs (ACs) are pairs of structurally similar molecules with significantly different affinities for a biotarget, posing a challenge in computer-assisted drug discovery. This study focuses on protein kinases, significant therapeutic targets, with some exhibiting ACs while others do not despite numerous inhibitors. The hypothesis that the presence of ACs is dependent on the target protein and its complete structural context is explored. Machine learning models were developed to link protein properties to ACs, revealing specific tripeptide sequences and overall protein properties as critical factors in ACs occurrence. The study highlights the importance of considering the entire protein matrix rather than just the binding site in understanding ACs. This research provides valuable insights for drug discovery and design, paving the way for addressing ACs-related challenges in modern computational approaches.


Subjects
Drug Discovery , Protein Kinase Inhibitors , Structure-Activity Relationship , Binding Sites , Protein Domains , Protein Kinase Inhibitors/pharmacology
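
The abstract of this record defines activity cliffs as pairs of structurally similar molecules with significantly different affinities. The sketch below turns that definition into a pairwise filter, using Tanimoto similarity over feature sets as the similarity measure. The similarity measure, both thresholds (0.9 similarity, 2 log units of potency), the function names, and the toy compounds are assumptions for illustration; the abstract does not specify the criteria the authors used.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def activity_cliffs(compounds, min_similarity=0.9, min_potency_gap=2.0):
    """Return name pairs that are similar in structure but differ in potency.

    `compounds` maps a name to (feature set, potency on a log scale such as
    pIC50). Both thresholds are illustrative defaults.
    """
    cliffs = []
    for (n1, (f1, p1)), (n2, (f2, p2)) in combinations(compounds.items(), 2):
        similar = tanimoto(f1, f2) >= min_similarity
        if similar and abs(p1 - p2) >= min_potency_gap:
            cliffs.append((n1, n2))
    return cliffs


if __name__ == "__main__":
    # Toy fingerprints and potencies, invented for illustration.
    toy = {
        "A": (set(range(1, 11)), 8.5),
        "B": (set(range(1, 12)), 5.9),
        "C": ({20, 21, 22}, 5.0),
    }
    print(activity_cliffs(toy))
```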